Don't run UpdateObjectAuthority for Comments and Downtimes #6885

Merged 1 commit into master from fix/uoa-comment-downtime on Jan 16, 2019


Crunsher
Contributor

This patch skips UpdateObjectAuthority (UOA) when Downtimes or Comments are created via the API. UOA accounts for most of the time spent creating a new object because it enumerates all objects. Downtimes are hit especially hard, since they are often created in bulk.

In my tests, creating downtimes for 15,000 Services took over 7 minutes (UOA runs 15,000 times, each time going over all ~30,000 objects). Without UOA, this went down to 50 seconds.
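To make the arithmetic behind those numbers concrete, here is a toy cost model. It is only a sketch: the function name `authority_visits` is made up for illustration and this is not Icinga's code. It assumes, as described above, that every object creation triggers one full pass over all objects.

```python
def authority_visits(new_objects: int, total_objects: int, run_uoa: bool) -> int:
    """Count object visits caused by authority updates (toy model, not Icinga code)."""
    if not run_uoa:
        # With the patch, creating a Downtime or Comment triggers no pass at all.
        return 0
    # Without the patch: one full pass over every object per created object.
    return new_objects * total_objects


with_uoa = authority_visits(15_000, 30_000, run_uoa=True)
without_uoa = authority_visits(15_000, 30_000, run_uoa=False)
print(f"{with_uoa:,} visits with UOA, {without_uoa:,} without")
# prints "450,000,000 visits with UOA, 0 without"
```

The work grows with the product of the two counts, which is why bulk downtime scheduling suffers the most.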

To make sure Downtimes still work without UOA, I ran the following test:

  1. Set up a master-master zone (M1 & M2)
  2. Create Downtimes via M1's API
  3. Query Downtimes on M2 and check that they are active
  4. Success and Profit
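Step 3 could be automated roughly as follows. This is a sketch under assumptions: the helper name `all_downtimes_active` is hypothetical, and the top-level `results` list is assumed to be the shape of the `/v1/objects/downtimes` response; only the `attrs` and `active` fields appear in this thread.

```python
import json


def all_downtimes_active(response_body: str) -> bool:
    """Return True if the response holds at least one downtime and all are active."""
    # Assumption: downtime objects are listed under a top-level "results" key.
    results = json.loads(response_body).get("results", [])
    return bool(results) and all(item["attrs"]["active"] for item in results)


# Hand-written sample in the assumed response shape, not real API output.
sample = json.dumps({"results": [{"attrs": {"active": True, "author": "icingaadmin"}}]})
print(all_downtimes_active(sample))
```

An empty result list is treated as a failure here, since the test expects the downtimes created on M1 to show up on M2.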

About the implementation:
I am using type->GetName() and comparing strings. This may seem ugly, but it is necessary here: comparing types directly would require libremote to link against libicinga, which would create a circular dependency.
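The idea of deciding by type name can be sketched like this. It is an illustration only, not the patch itself, which is C++. The names `TYPES_WITHOUT_AUTHORITY_UPDATE` and `should_update_authority` are hypothetical, and the type-name strings "Comment" and "Downtime" are assumed.

```python
# Hypothetical names for illustration; the real patch compares type->GetName() in C++.
TYPES_WITHOUT_AUTHORITY_UPDATE = frozenset({"Comment", "Downtime"})


def should_update_authority(type_name: str) -> bool:
    """Decide by type name only, so the caller needs no reference to the type itself."""
    return type_name not in TYPES_WITHOUT_AUTHORITY_UPDATE


for name in ("Host", "Downtime", "Comment"):
    print(name, should_update_authority(name))
```

Comparing names instead of type objects is what lets the calling code avoid depending on the module that defines those types.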

@Crunsher Crunsher added area/configuration DSL, parser, compiler, error handling area/api REST API labels Jan 10, 2019
@Crunsher Crunsher requested a review from dnsmichi January 10, 2019 11:01
@Crunsher
Contributor Author

More detailed test protocol:
Zone 'double' with two masters: M1 running this patch, M2 running the current stable release. 15,000 Services in the zone.

script used:

time curl -k -s -u root:icinga -H 'Accept: application/json' -X POST 'https://M1:5665/v1/actions/schedule-downtime?type=Service' -d '{ "start_time": 1544613000, "end_time": 1754613000, "author": "icingaadmin", "comment": "ALL DOWN"}'
echo 
sleep 1
curl -k -s -u root:icinga -H 'Accept: application/json' -X GET 'https://M2:5665/v1/objects/downtimes' | jq

Cropped output (it's a lot of text)

time: 2:22.96

      "attrs": {
        "__name": "test0!test-666!2381432d-9190-415d-8a3f-78dda94c22e3",
        "active": true,
        "author": "icingaadmin",
        "comment": "ALL DOWN",
        "config_owner": "",
        "duration": 0,
        "end_time": 1754613000,
        "entry_time": 1547213512.082966,
        "fixed": true,
        "ha_mode": 0,

For comparison, scheduling those downtimes without this patch takes 12 minutes.

@dnsmichi dnsmichi added this to the 2.11.0 milestone Jan 16, 2019
@dnsmichi
Contributor

Thanks. It is ugly in this case, but larger rewrites in this region are TBD.

The paused attribute is never evaluated for these runtime objects, so updating the authority for all objects in an instance isn't needed.

@dnsmichi dnsmichi merged commit 6d7d523 into master Jan 16, 2019
@towolf

towolf commented Jan 17, 2019

@Crunsher thanks! Running test with 29,000 services on our QA system. Been running for 10 min. under 2.10.2 already. Will check final times tomorrow and then test a master snapshot.

@towolf

towolf commented Jan 17, 2019

Result before the patch: real 40m48.616s

command used:
time curl -k -u downtime:jU5... -H 'Accept: application/json' -X POST 'https://localhost:5665/v1/actions/schedule-downtime?type=Service' -d '{ "start_time": 1544613000, "end_time": 1754613000, "author": "icingaadmin", "comment": "ALL DOWN"}'


BTW, the same command with remove-downtime takes real 0m29.085s

@towolf

towolf commented Jan 18, 2019

@Crunsher with your patch included (build 2.10.2+164.g21a592a.2019.01.17+1.xenial-0), it now completes more than 10 times faster: real 3m26.083s
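For reference, the speedup can be computed from the two `real` timings reported in this thread. This is a small sketch; the helper name `real_seconds` is made up. Note that the two runs used different builds (2.10.2 versus a master snapshot), so the ratio may not be due to this patch alone.

```python
import re


def real_seconds(value: str) -> float:
    """Convert a shell `time` value such as '40m48.616s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


before = real_seconds("40m48.616s")  # run under 2.10.2
after = real_seconds("3m26.083s")    # run with the snapshot build including the patch
speedup = before / after
print(f"{speedup:.1f}x faster")
```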

@dnsmichi dnsmichi deleted the fix/uoa-comment-downtime branch March 18, 2019 14:17
@dnsmichi dnsmichi modified the milestones: 2.11.0, 2.10.4 Mar 19, 2019
@dnsmichi dnsmichi added the bug Something isn't working label Mar 19, 2019
@dnsmichi
Contributor

ref/IP/9235

@dnsmichi dnsmichi added the backported Fix was included in a bugfix release label Mar 19, 2019
@ekeih
Contributor

ekeih commented Apr 1, 2019

This is awesome. >60k downtimes in 7 minutes instead of 1-2 hours. Thanks @Crunsher!

Labels
area/api REST API area/configuration DSL, parser, compiler, error handling backported Fix was included in a bugfix release bug Something isn't working ref/IP